Mine Rule

Author

  • Giuseppe Psaila
Abstract

Mining of association rules is one of the most widely adopted data mining techniques across the most widespread application domains. A great deal of work has been carried out in recent years on the development of efficient algorithms for association rule extraction. Indeed, this problem is computationally difficult, known to be NP-hard (Calders, 2004), and it is made harder by the fact that association rules are normally extracted from very large databases. Moreover, in order to increase the relevance and interestingness of the results and to reduce the overall volume of the result, constraints on association rules are introduced and must be evaluated (Ng et al., 1998; Srikant et al., 1997). However, in this contribution we do not focus on the problem of developing efficient algorithms but on the semantic problem behind the extraction of association rules (see Tsur et al. (1998) for an interesting generalization of this problem). We want to highlight the semantic dimensions that characterize the extraction of association rules; that is, we describe in a more general way the classes of problems that association rules solve. To accomplish this, we adopt a general-purpose query language designed for the extraction of association rules from relational databases. The operator of this language, MINE RULE, allows constraints to be expressed as standard SQL predicates, which makes it suitable for use in many diverse application problems. For a comparison between this query language and other state-of-the-art languages for data mining, see Imielinski et al. (1996); Han et al. (1996); Netz et al. (2001); Botta et al. (2004).
Imielinski et al. (1996) propose a new approach to data mining based on a new generation of databases called Inductive Databases (IDBs). With an IDB, the user/analyst can use advanced query languages for data mining in order to interact with the knowledge discovery (KDD) system, extract descriptive and predictive data mining patterns from the database, and store them in the database. Boulicaut et al. (1998) and Baralis et al. (1999) discuss the usage of MINE RULE in this context. We want to show that, thanks to a highly expressive query language, it is possible to exploit all the semantic possibilities of association rules and to solve very different problems with a single language, whose statements are instantiated along the different semantic dimensions of the same application domain. We discuss examples of statements that solve problems in different application domains that are of great importance today. The first application is the analysis of retail data, aimed at market basket analysis (Agrawal et al., 1993) and the discovery of user profiles for customer relationship management (CRM). The second application is the analysis of data logged by a Web server on users' accesses to Web sites; Cooley et al. (2000) present a study on the same application domain. The last domain is the analysis of genomic databases containing data on micro-array experiments (Fayyad, 2003). We show many practical examples of MINE RULE statements and discuss the application problems that can be solved by analyzing the association rules that result from those statements.
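
As a rough illustration of what extracting association rules under constraints involves, the following Python sketch enumerates rules of the form body -> head over a toy set of purchases and keeps those that meet minimum support and confidence thresholds. It is not the MINE RULE operator and does not reproduce its syntax; the data, the function names `support` and `mine_rules`, the single-item head, and the threshold values are all assumptions made for the example, and the brute-force enumeration stands in for the efficient algorithms cited above.

```python
from itertools import combinations

# Hypothetical toy data: each set is one customer transaction (one "group").
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, groups):
    """Fraction of groups that contain every item of `itemset`."""
    return sum(1 for g in groups if itemset <= g) / len(groups)

def mine_rules(groups, min_support, min_confidence, max_body=2):
    """Brute-force enumeration of rules body -> head with a single-item head.

    A rule is kept when its support (share of groups containing body and
    head together) and its confidence (rule support divided by body
    support) both reach the given thresholds.
    """
    items = sorted(set().union(*groups))
    rules = []
    for size in range(1, max_body + 1):
        for body in combinations(items, size):
            body_set = frozenset(body)
            body_support = support(body_set, groups)
            if body_support == 0:
                continue  # body never occurs, confidence is undefined
            for head in items:
                if head in body_set:
                    continue
                rule_support = support(body_set | {head}, groups)
                confidence = rule_support / body_support
                if rule_support >= min_support and confidence >= min_confidence:
                    rules.append((body, head, rule_support, confidence))
    return rules

# Assumed thresholds, chosen only for this example.
rules = mine_rules(transactions, min_support=0.5, min_confidence=0.6)
for body, head, sup, conf in rules:
    print(f"{set(body)} -> {head}  support={sup:.2f} confidence={conf:.2f}")
```

In this sketch the grouping attribute is the transaction; grouping the same purchases by customer or by date instead would give rules with a different meaning, which is the kind of semantic dimension the abstract refers to.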

Similar resources

Determination of Ore/Waste Boundary Using Indicator Kriging, Case Study: Choghart Iron Mine of Iran

Estimation of ore reserves is one of the most critical aspects of mining geology. The accurate assessment of the tonnage and grade of run of mine may be the difference between a healthy profitable operation and an expensive early mine closure. The first step in ore reserve estimation is to determine the boundary of ore body or ore/waste contacts. This paper presents a specific mining applicatio...

The Development of Internet of Things in Coal Mine Based on Rough Set

This paper analyzes the key technical problems to be solved for the coal mine Internet of Things. To improve the diagnostic accuracy of fan failures in the coal mine Internet of Things, mechanical failure data sets from the UCI database are used for analysis. According to the coal mine of things to determine the characteristics of the fan through the vertical amplitude of the frequency, the base rate of vertical v...

A Novel Algorithm for Association Rule Mining from Data with Incomplete and Missing Values

Missing values and incomplete data are a natural phenomenon in real datasets. If association rules are mined from incomplete data in disregard of missing values, mistaken rules are derived. In association rule mining, the treatment of missing values and incomplete data is important. This paper proposes a novel technique to mine association rules from data with missing values in large, voluminous databases. The p...

Association Rules Mining Algorithm

Many algorithms for mining association rules are divided into two stages: the first finds the frequent sets; the second uses the frequent sets to generate association rules. This proposal discusses the respective characteristics and shortcomings of the current algorithms for mining association rules and proposes another, faster method; unlike the other algorithms, this algorithm emphasis...

Mining Closed Strong Association Rules by Rule-growth in Resource Effectiveness Matrix

The association rules mining approach can find relationships among items. Using an association rules mining algorithm to mine resource faults can reduce the number of wrong-alarm resources to be replaced. This paper proposes an efficient association rules mining algorithm, CSRule, for mining closed strong association rules based on association rule merging strategies. The CSRule algorithm adopts severa...

Mining of Frequent Itemsets with JoinFI-Mine Algorithm

Association rule mining among frequent items has been widely studied in the data mining field. Many studies have improved algorithms for generating all the frequent itemsets. In this paper, we propose a new algorithm to mine all frequent itemsets from a transaction database. The main features of this paper are: (1) the database is scanned only once to mine frequent itemsets; (2) the ...

Publication date: 2015